Detecting and handling driving event sounds during a navigation session

ABSTRACT

To identify driving event sounds during navigation, a client device in a vehicle provides a set of navigation directions for traversing from a starting location to a destination location along a route. During navigation to the destination location, the client device identifies audio that includes a driving event sound from within the vehicle or an area surrounding the vehicle. In response to determining that the audio includes the driving event sound, the client device determines whether the driving event sound is artificial. In response to determining that the driving event sound is artificial, the client device presents a notification to the driver indicating that the driving event sound is artificial or masks the driving event sound to prevent the driver from hearing the driving event sound.

CROSS-REFERENCE TO RELATED APPLICATIONS

The present application is a continuation of and claims priority to U.S. application Ser. No. 17/273,673, filed Mar. 4, 2021, entitled “Detecting and Handling Driving Event Sounds During a Navigation Session,” which claims priority to PCT/US20/60984 filed Nov. 18, 2020, the disclosure of each of which is incorporated herein by reference in its entirety for all purposes.

FIELD OF THE DISCLOSURE

The present disclosure relates to detecting driving event sounds and, more particularly, to preventing driver distraction by masking the effects of artificial driving event sounds generated by electronic devices.

BACKGROUND

The background description provided herein is for the purpose of generally presenting the context of the disclosure. Work of the presently named inventors, to the extent it is described in this background section, as well as aspects of the description that may not otherwise qualify as prior art at the time of filing, are neither expressly nor impliedly admitted as prior art against the present disclosure.

Today, software applications executing in computers, smartphones, etc. or embedded devices generate step-by-step navigation directions. Typically, a user specifies the starting point and the destination, and a software application displays and/or presents the directions in an audio format immediately and/or as the user travels from the starting point to the destination.

During navigation there may be many distractions to the driver. One type of distraction may be when driving-related noises (e.g., emergency vehicle sirens, car horns, vehicle collisions, etc.) are played within the vehicle. These driving-related noises may deceive the driver into thinking the noises are real noises coming from external sources rather than artificial sounds generated by devices within the vehicle. As a result, the driver may react to the driving-related noises, for example by slowing down or pulling over unnecessarily.

SUMMARY

In some implementations, a mapping application operating within a vehicle may identify driving event sounds from within the vehicle or from the area surrounding the vehicle. For example, the mapping application may identify the driving event sounds while presenting navigation directions for assisting a driver in traversing from a starting location to a destination location. The driving event sounds may be emergency vehicle sirens, car horns, the sounds of vehicle collisions, etc.

The mapping application may identify the driving event sounds by communicating with other applications executing on a client device, for example, via an application programming interface (API). The other applications executing on the client device may provide characteristics of audio content being played by the other applications, such as an audio stream for current or upcoming audio content or metadata describing the audio content (e.g., a title of the audio content, a description of the audio content, terms, phrases, or sounds included in the audio content, the length of the audio content, the language of the audio content, etc.). Additionally, the mapping application may communicate with other devices within the vehicle (e.g., a vehicle head unit), such as via a short-range communication link. The other devices may also provide characteristics of audio content to the mapping application. Still further, the mapping application may identify the driving event sounds by comparing audio fingerprints of predetermined driving event sounds to ambient audio in the surrounding area.

In any event, when a driving event sound is identified, the mapping application may determine whether the driving event sound is real (i.e., the driving event sound is provided from the vehicle or from an external source outside of the vehicle, such as another vehicle or an emergency vehicle, and requires the driver's attention) or artificial (i.e., the driving event sound comes from an electronic device within the vehicle and does not require the driver's attention). The mapping application may determine that the driving event sound is artificial if the driving event sound is identified from an electronic source within the vehicle, such as another application executing on the client device or another device. In other implementations, the mapping application may determine whether the driving event sound is real or artificial by applying characteristics of the driving event sound to a machine learning model trained to distinguish between real and artificial driving event sounds. The characteristics of the driving event sound may include audio characteristics of the driving event sound as well as environmental characteristics at the vehicle at the time of the driving event sound, such as whether a vehicle door is open, which may result in a vehicle door alarm.

When the mapping application identifies an artificial driving event sound, the mapping application attempts to mask the effect of the driving event sound on the driver to substantially prevent the driver from being distracted by the driving event sound. For example, the mapping application may display a notification on the client device indicating that the driving event sound is artificial and instructing the driver to ignore it. The mapping application may also play an audio notification with similar instructions. Additionally or alternatively, the mapping application may prevent at least a portion of the driving event sound from being played. For example, the mapping application may mute the audio or decrease the volume on the audio during the driving event sound. In other implementations, the mapping application may filter out the audio during the driving event sound via a bandpass filter, for example.

When the mapping application identifies a real driving event sound, the mapping application may alert the driver that the sound is real so that the driver does not ignore the driving event sound. For example, the mapping application may display a notification on the client device indicating that the driving event sound is real and instructing the driver to respond appropriately. The mapping application may also play an audio notification with similar instructions.

In this manner, the mapping application may reduce the amount of distraction to the driver, thereby improving driver safety. Such a reduction in the amount of distraction is achieved by filtering or masking the artificial sounds that relate to driving events. As disclosed herein, this filtering may reduce the volume of such artificial sounds, remove portions or all of such artificial sounds, or provide one or more notifications to inform the driver that the artificial sound is not real. In this manner, the driver does not react and alter their control of a vehicle (i.e., react to the sound) either by not hearing the artificial sound or by being informed that the artificial sound is not real. As such, the mapping application disclosed herein has greatly improved safety as the influence of artificial driving sounds on the navigation instructions is actively reduced. The mapping application may also assist the driver in identifying emergency vehicles, vehicle collisions, or vehicle malfunctions and help the driver respond appropriately.

One example embodiment of the techniques of this disclosure is a method for identifying driving event sounds during a navigation session. The method includes providing a set of navigation directions for traversing from a starting location to a destination location along a route. During navigation to the destination location, the method includes identifying audio that includes a driving event sound from within the vehicle or an area surrounding the vehicle. In response to determining that the audio includes the driving event sound, the method includes determining whether the driving event sound is artificial. In response to determining that the driving event sound is artificial, the method includes presenting a notification to the driver indicating that the driving event sound is artificial, or masking the driving event sound to prevent the driver from hearing the driving event sound.

Another example embodiment of the techniques of this disclosure is a client device for identifying driving event sounds. The client device includes one or more processors, and a non-transitory computer-readable memory coupled to the one or more processors and storing instructions thereon. The instructions, when executed by the one or more processors, cause the client device to identify audio that includes a driving event sound from within the vehicle or an area surrounding the vehicle. In response to determining that the audio includes the driving event sound, the instructions cause the client device to determine whether the driving event sound is artificial. In response to determining that the driving event sound is artificial, the instructions cause the client device to present a notification to the driver indicating that the driving event sound is artificial, or mask the driving event sound to prevent the driver from hearing the driving event sound.

Yet another example embodiment of the techniques of this disclosure is non-transitory computer-readable memory storing instructions thereon. The instructions, when executed by one or more processors, cause the one or more processors to identify audio that includes a driving event sound from within the vehicle or an area surrounding the vehicle. In response to determining that the audio includes the driving event sound, the instructions cause the one or more processors to determine whether the driving event sound is artificial. In response to determining that the driving event sound is artificial, the instructions cause the one or more processors to present a notification to the driver indicating that the driving event sound is artificial, or mask the driving event sound to prevent the driver from hearing the driving event sound.

BRIEF DESCRIPTION OF THE DRAWINGS

FIG. 1 illustrates an example vehicle in which the techniques of the present disclosure can be used to detect driving event sounds;

FIG. 2 is a block diagram of an example system in which techniques for detecting driving event sounds can be implemented;

FIG. 3 is a combined block and logic diagram that illustrates the process for identifying a driving event sound based on characteristics of an audio stream using a first machine learning model;

FIG. 4 is a combined block and logic diagram that illustrates the process for identifying whether a driving event sound is real or artificial based on characteristics of the driving event sound using a second machine learning model;

FIGS. 5A-5C are example navigation displays including notifications to the driver in response to detecting a driving event sound; and

FIG. 6 is a flow diagram of an example method for identifying driving event sounds during a navigation session, which can be implemented in a client device.

DETAILED DESCRIPTION

Overview

Generally speaking, the techniques for identifying driving event sounds can be implemented in one or several client devices, a vehicle head unit, one or several network servers, or a system that includes a combination of these devices. However, for clarity, the examples below focus primarily on an embodiment in which a client device executing a mapping application obtains audio playback data from another application executing on the client device or from a device communicatively connected to the client device. The client device may communicate with other applications executing on the client device (e.g., via an API), or other devices within the vicinity of the client device (e.g., via a short-range communication link), such as the vehicle head unit or other client devices. The client device may also obtain or compute an ambient audio fingerprint of the audio within the area. In any event, the client device determines whether the audio from the audio playback data or the ambient audio fingerprint includes a driving event sound. More specifically, the client device may compare the ambient audio fingerprint or an audio fingerprint from the audio stream included in the audio playback data to audio fingerprints of predetermined driving event sounds. When there is a match, the client device may determine that the audio includes a driving event sound.
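The fingerprint comparison described above can be illustrated with a minimal sketch. The disclosure does not specify how fingerprints are computed or compared, so everything here is an assumption made for illustration: fingerprints are modeled as 32-bit integers, similarity is the fraction of matching bits, and the names `hamming_similarity`, `match_driving_event`, `KNOWN_SOUNDS`, and the 0.9 threshold are hypothetical.

```python
from typing import Dict, Optional

# Assumption: a fingerprint is a fixed-length bit pattern stored as an int.
FINGERPRINT_BITS = 32


def hamming_similarity(a: int, b: int, bits: int = FINGERPRINT_BITS) -> float:
    """Return the fraction of bit positions on which two fingerprints agree."""
    mask = (1 << bits) - 1
    differing = bin((a ^ b) & mask).count("1")
    return 1.0 - differing / bits


def match_driving_event(
    candidate: int,
    known: Dict[str, int],
    threshold: float = 0.9,
) -> Optional[str]:
    """Return the label of the closest predetermined fingerprint, or None
    when no predetermined fingerprint is similar enough to count as a match."""
    best_label: Optional[str] = None
    best_score = 0.0
    for label, reference in known.items():
        score = hamming_similarity(candidate, reference)
        if score > best_score:
            best_label, best_score = label, score
    return best_label if best_score >= threshold else None


# Hypothetical predetermined driving event sound fingerprints.
KNOWN_SOUNDS = {
    "emergency_siren": 0xF0F0A5A5,
    "car_horn": 0x0F0F3C3C,
}
```

The same function could be applied to an ambient audio fingerprint or to a fingerprint derived from an audio stream in the audio playback data; a `None` result corresponds to the case in which no driving event sound is identified.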

In other implementations, the client device may apply the ambient audio or the audio stream from the audio playback data, including any derived features from the ambient audio or other audio stream such as an audio fingerprint, as an input to a trained machine learning model for identifying driving event sounds. More specifically, a server device may have generated the trained machine learning model by training a machine learning model using audio features for a set of audio streams, and indications of whether or not a driving event sound corresponds to each audio stream. The audio streams may be classified as including a driving event sound or not including a driving event sound. In some implementations, the audio streams may be classified according to particular types of driving event sounds, such as an emergency vehicle siren, a sound of a vehicle collision, a vehicle malfunction alarm, or a vehicle horn honking. In any event, the server device may provide the trained machine learning model to the client device. The client device may then continuously or periodically apply ambient audio features or audio features from audio streams included in the audio playback data to the trained machine learning model to identify driving event sounds.

When a driving event sound is identified, the client device may determine whether the driving event sound is real or artificial. The client device may determine that the driving event sound is artificial if the driving event sound is identified from a source within the vehicle, such as another application executing on the client device or another device. In other implementations, the client device may determine that the driving event sound is artificial by comparing the ambient audio fingerprint or an audio fingerprint from the audio stream included in the audio playback data to audio fingerprints of predetermined artificial driving event sounds. When there is a match, the client device may determine that the audio includes an artificial driving event sound. The client device may also compare the ambient audio fingerprint or an audio fingerprint from the audio stream included in the audio playback data to audio fingerprints of predetermined real driving event sounds. When there is a match, the client device may determine that the audio includes a real driving event sound. In yet other implementations, the client device may determine whether the driving event sound is real or artificial by applying characteristics of the driving event sound to a machine learning model trained to distinguish between real and artificial driving event sounds. The server device may have generated the trained machine learning model for classifying driving event sounds as real or artificial by training a machine learning model using characteristics of driving event sounds and indications of whether each driving event sound is real or artificial. The characteristics of each driving event sound may include audio characteristics of the driving event sound as well as environmental characteristics at the vehicle at the time of the driving event sound. In any event, the server device may provide the trained machine learning model to the client device. The client device may then apply characteristics of a detected driving event sound as an input to the trained machine learning model to determine whether the driving event sound is real or artificial.
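The alternatives in the preceding paragraph (source within the vehicle, fingerprint match, and machine learning model) can be sketched as a single decision function. The disclosure presents these as separate implementations; combining them in the order shown is an assumption made only for illustration, and the names `Origin`, `DetectedSound`, `classify_origin`, and the 0.5 threshold are hypothetical.

```python
from dataclasses import dataclass
from enum import Enum
from typing import Optional


class Origin(Enum):
    REAL = "real"
    ARTIFICIAL = "artificial"
    UNKNOWN = "unknown"


@dataclass
class DetectedSound:
    # True when the sound was found in audio playback data reported by
    # another application or by a device within the vehicle.
    from_playback_data: bool
    matches_artificial_fingerprint: bool = False
    matches_real_fingerprint: bool = False
    # Hypothetical output of a trained model: probability the sound is real.
    model_probability_real: Optional[float] = None


def classify_origin(sound: DetectedSound, threshold: float = 0.5) -> Origin:
    """Apply the checks in an assumed order: source, fingerprints, then model."""
    if sound.from_playback_data or sound.matches_artificial_fingerprint:
        return Origin.ARTIFICIAL
    if sound.matches_real_fingerprint:
        return Origin.REAL
    if sound.model_probability_real is not None:
        if sound.model_probability_real >= threshold:
            return Origin.REAL
        return Origin.ARTIFICIAL
    return Origin.UNKNOWN
```

An `UNKNOWN` result is not part of the disclosure; it is included here so that the sketch does not guess when none of the checks applies.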

When the client device identifies an artificial driving event sound, the client device may notify the driver that the driving event sound is artificial and may instruct the driver to ignore it. The client device may also mute the audio, decrease the volume of the audio, or filter the driving event sound from the audio. When the client device identifies a real driving event sound, the client device may alert the driver that the sound is real so that the driver can respond appropriately.
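Muting or decreasing the volume during an artificial driving event sound can be sketched as a gain applied to the affected samples. This is a simplified illustration under stated assumptions: the audio is available as a list of floating-point samples, the start and end times of the driving event sound are already known, and the function name `duck_interval` is hypothetical. A gain of 0.0 corresponds to muting, and a gain between 0 and 1 corresponds to decreasing the volume.

```python
from typing import List


def duck_interval(
    samples: List[float],
    sample_rate: int,
    start_s: float,
    end_s: float,
    gain: float = 0.0,
) -> List[float]:
    """Return a copy of samples with the interval [start_s, end_s) scaled by gain.

    Samples outside the interval are returned unchanged.
    """
    # Convert times to sample indices and clamp them to the buffer bounds.
    start = max(0, int(start_s * sample_rate))
    end = min(len(samples), int(end_s * sample_rate))
    return [
        sample * gain if start <= index < end else sample
        for index, sample in enumerate(samples)
    ]
```

For example, `duck_interval(audio, 44100, 12.0, 15.5, gain=0.2)` would reduce the amplitude of the samples between 12.0 and 15.5 seconds to 20 percent while leaving the rest of the stream untouched.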

In some implementations, the client device identifies driving event sounds during a navigation session provided by the mapping application. For example, when a user such as the driver requests navigation directions from a starting location to a destination location, the mapping application may provide the request to a navigation data server. The navigation data server may then provide a set of navigation directions to the client device which may be presented by the mapping application. While the mapping application presents the set of navigation directions to the user, the client device may identify driving event sounds and mask the effects of artificial driving event sounds on the driver. In other implementations, the client device identifies driving event sounds any time the client device is within a vehicle, regardless of whether there is an active navigation session. The user of the client device may request that the mapping application identify driving event sounds during navigation sessions or any time when the user is within a vehicle. Beneficially therefore, the mapping application disclosed herein can improve driving safety regardless of whether navigation is being used or not.

Example Hardware and Software Components

Referring to FIG. 1, an example environment 1 in which the techniques outlined above can be implemented includes a portable device 10 and a vehicle 12 with a head unit 14. The portable device 10 may be a smart phone, a tablet computer, or an in-vehicle navigation system, for example. The portable device 10 communicates with the head unit 14 of the vehicle 12 via a communication link 16, which may be wired (e.g., Universal Serial Bus (USB)) or wireless (e.g., Bluetooth, Wi-Fi Direct). The portable device 10 also can communicate with various content providers, servers, etc. via a wireless communication network such as a fourth- or third-generation cellular network (4G or 3G, respectively).

The head unit 14 can include a display 18 for presenting navigation information such as a digital map. The display 18 in some implementations is a touchscreen and includes a software keyboard for entering text input, which may include the name or address of a destination, point of origin, etc. Hardware input controls 20 and 22 on the head unit 14 and the steering wheel, respectively, can be used for entering alphanumeric characters or to perform other functions for requesting navigation directions. The head unit 14 also can include audio input and output components such as a microphone 24 and speakers 26, for example. The speakers 26 can be used to play audio instructions or audio notifications sent from the portable device 10.

An example communication system 100 in which a driving event sound detection system can be implemented is illustrated in FIG. 2. The communication system 100 includes a client device 10 configured to execute a geographic application 122, which also can be referred to as “mapping application 122.” Depending on the implementation, the application 122 can display an interactive digital map, request and receive routing data to provide driving, walking, or other navigation directions including audio navigation directions, provide various geolocated content, etc. The client device 10 may be operated by a user (also referred to herein as a “driver”) displaying a digital map while navigating to various locations. The communication system 100 also includes a vehicle head unit 14 which may communicate with the client device 10 via a short-range communication link such as Bluetooth, Wi-Fi Direct, etc. Furthermore, the communication system 100 may include other computing devices 92 within the vicinity of the client device 10. For example, when the client device 10 is a driver's smart phone, the other computing devices 92 may include smart phones of passengers within the vehicle 12, or a tablet or wearable device of the driver.

In addition to the client device 10, the communication system 100 includes a server device 60 configured to provide trained machine learning models to the client device 10. The server device 60 can be communicatively coupled to a database 80 that stores, in an example implementation, a first machine learning model for identifying driving event sounds. The training data used as a training input for the first machine learning model may include audio features for a set of audio streams (i.e., characteristics of each audio stream such as frequencies, pitches, tones, amplitudes, etc.), and indications of whether or not a driving event sound is included in each audio stream. The audio streams may be classified as including a driving event sound or not including a driving event sound. In some implementations, the audio streams may be classified according to particular types of driving event sounds, such as an emergency vehicle siren, a sound of a vehicle collision, a vehicle malfunction alarm, or a vehicle horn honking. The training data is described in further detail below with reference to FIG. 3. Additionally, the database 80 may store a second machine learning model for determining whether a driving event sound is real or artificial. The training data used as a training input for the second machine learning model may include characteristics of driving event sounds and indications of whether each driving event sound is real or artificial. The training data is described in further detail below with reference to FIG. 4.

More generally, the server device 60 can communicate with one or several databases that store any type of suitable geospatial information or information that can be linked to a geographic context. The communication system 100 also can include a navigation data server 34 that provides navigation directions such as driving, walking, biking, or public transit directions, for example. Further, the communication system 100 can include a map data server 50 that provides map data to the server device 60 for generating a map display. The devices operating in the communication system 100 can be interconnected via a communication network 30.

In various implementations, the client device 10 may be a smartphone or a tablet computer. The client device 10 may include a memory 120, one or more processors (CPUs) 116, a graphics processing unit (GPU) 112, an I/O module 114 including a microphone and speakers, a user interface (UI) 32, and one or several sensors 19 including a Global Positioning System (GPS) module. The memory 120 can be a non-transitory memory and can include one or several suitable memory modules, such as random access memory (RAM), read-only memory (ROM), flash memory, other types of persistent memory, etc. The I/O module 114 may be a touch screen, for example. In various implementations, the client device 10 can include fewer components than illustrated in FIG. 2 or, conversely, additional components. In other embodiments, the client device 10 may be any suitable portable or non-portable computing device. For example, the client device 10 may be a laptop computer, a desktop computer, a wearable device such as a smart watch or smart glasses, etc.

The memory 120 stores an operating system (OS) 126, which can be any type of suitable mobile or general-purpose operating system. The OS 126 can include application programming interface (API) functions that allow applications to retrieve sensor readings. For example, a software application configured to execute on the client device 10 can include instructions that invoke an OS 126 API for retrieving a current location of the client device 10 at that instant. The API can also return a quantitative indication of how certain the API is of the estimate (e.g., as a percentage).

The memory 120 also stores a mapping application 122, which is configured to generate interactive digital maps and/or perform other geographic functions, as indicated above. The mapping application 122 can receive navigation instructions including audio navigation instructions and present the navigation instructions. The mapping application 122 also can display driving, walking, or transit directions, and in general provide functions related to geography, geolocation, navigation, etc. Additionally, the mapping application 122 can detect driving event sounds via the driving event sound detector 134. The driving event sound detector 134 may also determine whether a detected driving event sound is real or artificial. If the detected driving event sound is artificial, the driving event sound detector 134 may present a notification to the driver indicating that the sound is artificial and/or instructing the driver to ignore it. The driving event sound detector 134 may also or instead mute or lower the volume of the client device or the other device playing the driving event sound as the driving event sound is being played. Still further, the driving event sound detector 134 may filter an audio stream to prevent the audio stream from playing the driving event sound. If the detected driving event sound is real, the driving event sound detector 134 may present a notification to the driver indicating that the sound is real and/or instructing the driver to respond appropriately (e.g., to pull over, call for help, take the vehicle in for service, etc.).

It is noted that although FIG. 2 illustrates the mapping application 122 as a standalone application, the functionality of the mapping application 122 also can be provided in the form of an online service accessible via a web browser executing on the client device, as a plug-in or extension for another software application executing on the client device, etc. The mapping application 122 generally can be provided in different versions for different respective operating systems. For example, the maker of the client device 10 can provide a Software Development Kit (SDK) including the mapping application 122 for the Android™ platform, another SDK for the iOS™ platform, etc.

In addition to the mapping application 122, the memory 120 stores other client applications 132, such as music applications, video applications, gaming applications, streaming applications, radio applications, social media applications, etc. which play audio content. These applications 132 may expose APIs for communicating with the mapping application 122.

In some implementations, the server device 60 includes one or more processors 62 and a memory 64. The memory 64 may be tangible, non-transitory memory and may include any types of suitable memory modules, including random access memory (RAM), read-only memory (ROM), flash memory, other types of persistent memory, etc. The memory 64 stores instructions executable on the processors 62 that make up a driving event sound machine learning (ML) model generator 68, which can generate a first machine learning model for identifying driving event sounds and a second machine learning model for determining whether an identified driving event sound is real or artificial.

The driving event sound ML model generator 68 and the driving event sound detector 134 can operate as components of a driving event sound detection system. Alternatively, the driving event sound detection system can include only server-side components and simply provide the driving event sound detector 134 with instructions to present notifications or adjust the audio. In other words, driving event sound detection techniques in these embodiments can be implemented transparently to the driving event sound detector 134. As another alternative, the entire functionality of the driving event sound ML model generator 68 can be implemented in the driving event sound detector 134.

For simplicity, FIG. 2 illustrates the server device 60 as only one instance of a server. However, the server device 60 according to some implementations includes a group of one or more server devices, each equipped with one or more processors and capable of operating independently of the other server devices. Server devices operating in such a group can process requests from the client device 10 individually (e.g., based on availability), in a distributed manner where one operation associated with processing a request is performed on one server device while another operation associated with processing the same request is performed on another server device, or according to any other suitable technique. For the purposes of this discussion, the term “server device” may refer to an individual server device or to a group of two or more server devices.

In operation, the driving event sound detector 134 operating in the client device 10 receives data from and transmits data to the server device 60 and/or the navigation data server 34. Thus, in one example, the client device 10 may transmit a communication to the navigation data server 34 requesting navigation directions from a starting location to a destination. Accordingly, the navigation data server 34 may generate a set of navigation instructions and provide the set of navigation instructions to the client device 10. The client device 10 may also transmit a communication to the driving event sound ML model generator 68 (implemented in the server device 60) requesting the first machine learning model for identifying driving event sounds and the second machine learning model for determining whether an identified driving event sound is real or artificial.

The client device 10 may then apply audio features to the first machine learning model to detect driving event sounds, and may apply characteristics of detected driving event sounds to the second machine learning model to determine whether the driving event sounds are real or artificial.
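The two-stage use of the models can be sketched as follows. The model interfaces are assumptions: each model is represented as a plain callable returning a probability, which is not how the disclosure defines them, and the names `run_pipeline`, `DetectFn`, `OriginFn`, and both thresholds are hypothetical.

```python
from typing import Callable, Dict, Optional, Sequence

# Assumed interface of the first model: audio features in, probability that
# a driving event sound is present out.
DetectFn = Callable[[Sequence[float]], float]
# Assumed interface of the second model: characteristics of the detected
# sound in, probability that the sound is real out.
OriginFn = Callable[[Dict[str, float]], float]


def run_pipeline(
    audio_features: Sequence[float],
    sound_characteristics: Dict[str, float],
    detect_model: DetectFn,
    origin_model: OriginFn,
    detect_threshold: float = 0.5,
    origin_threshold: float = 0.5,
) -> Optional[str]:
    """Return None when no driving event sound is detected; otherwise
    return "real" or "artificial" based on the second model."""
    if detect_model(audio_features) < detect_threshold:
        return None  # The second model is only consulted after a detection.
    probability_real = origin_model(sound_characteristics)
    return "real" if probability_real >= origin_threshold else "artificial"
```

Keeping the two models behind separate callables mirrors the description, in which the second model receives characteristics of an already detected driving event sound rather than raw audio features.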

In some embodiments, the driving event sound ML model generator 68 may generate a separate machine learning model for each type of driving event sound. For example, the driving event sound ML model generator 68 may generate one machine learning model for identifying police sirens, another machine learning model for identifying fire truck sirens, yet another machine learning model for identifying ambulance sirens, another machine learning model for identifying a vehicle honking, another machine learning model for identifying the sound of a vehicle collision, yet another machine learning model for identifying a vehicle malfunction alarm, etc. In other implementations, the driving event sound ML model generator 68 may generate a single machine learning model with a different output class for each type of driving event sound.

FIG. 3 schematically illustrates an example process for training a first machine learning model 310 for detecting driving event sounds and applying the audio features of an audio stream to the first machine learning model 310 to detect a driving event sound in the audio stream. As described above, the driving event sound ML model generator 68 in the server device 60 may generate the first machine learning model 310. The first machine learning model 310 may be generated using various machine learning techniques such as a regression analysis (e.g., a logistic regression, linear regression, or polynomial regression), k-nearest neighbors, decision trees, random forests, boosting (e.g., extreme gradient boosting), neural networks (e.g., a convolutional neural network), support vector machines, deep learning, reinforcement learning, Bayesian networks, etc. To generate the first machine learning model 310, the driving event sound ML model generator 68 receives training data including a first audio stream 302a having a first set of audio characteristics 304a (e.g., audio features), and a first indication of whether the first audio stream 302a includes a driving event sound 306a. The first indication 306a may also include the type of driving event sound (e.g., an emergency vehicle siren, a sound of a vehicle collision, a vehicle malfunction alarm, or a vehicle horn honking). The training data also includes a second audio stream 302b having a second set of audio characteristics 304b, and a second indication of whether the second audio stream 302b includes a driving event sound 306b. Furthermore, the training data includes a third audio stream 302c having a third set of audio characteristics 304c, and a third indication of whether the third audio stream 302c includes a driving event sound 306c. Still further, the training data includes an nth audio stream 302n having an nth set of audio characteristics 304n, and an nth indication of whether the nth audio stream 302n includes a driving event sound 306n.

While the example training data includes four audio streams 302 a-302 n, this is merely for ease of illustration. The training data may include any number of audio streams and corresponding audio characteristics and indications of whether the audio streams include a driving event sound.

The driving event sound ML model generator 68 then analyzes the training data to generate a first machine learning model 310 for detecting driving event sounds. In some implementations, the driving event sound ML model generator 68 generates a separate machine learning model for each type of driving event sound. While the first machine learning model 310 is illustrated as a linear regression model, the first machine learning model 310 may be another type of regression model such as a logistic regression model, or may be a decision tree, a neural network, a hyperplane, or any other suitable machine learning model.

For example, when the machine learning technique is a neural network, the driving event sound ML model generator 68 may generate a graph having input nodes, intermediate or “hidden” nodes, edges, and output nodes. The nodes may represent a test or function performed on audio characteristics and the edges may represent connections between nodes. In some embodiments, the output nodes may include indications of whether the audio stream includes a driving event sound and/or the type of driving event sound. The indications may be likelihoods that the audio stream includes a driving event sound and/or likelihoods that the audio stream includes a particular type of driving event sound.

For example, a neural network may include four input nodes representing audio characteristics that are each connected to several hidden nodes. The hidden nodes are then connected to an output node that indicates whether the audio stream includes a driving event sound. The connections may have assigned weights and the hidden nodes may include tests or functions performed on the audio characteristics.

In some embodiments, the hidden nodes may be connected to several output nodes each indicating a type of driving event sound. In this example, the four input nodes may include the frequency, amplitude, pitch, and tone of an audio stream. Tests or functions may be applied to the input values at the hidden nodes. Then the results of the tests or functions may be weighted and/or aggregated to determine a likelihood that the audio stream includes a driving event sound. When the likelihood is above a threshold likelihood, the neural network may determine that the audio stream includes a driving event sound.
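The four-input network described above can be sketched as a small forward pass. This is only an illustrative sketch: the weights, biases, sigmoid activation, feature normalization, and the 0.5 threshold are all assumptions chosen for the example, not values from the disclosure, and a trained model would learn its own parameters.

```python
import math


def sigmoid(x: float) -> float:
    """Squash a weighted sum into the range (0, 1)."""
    return 1.0 / (1.0 + math.exp(-x))


def forward(features, hidden_weights, hidden_biases, output_weights, output_bias):
    """Return a likelihood that the audio stream includes a driving event sound.

    features: four input values (frequency, amplitude, pitch, tone),
    assumed here to be normalized to roughly [0, 1].
    """
    # Each hidden node applies a function to a weighted sum of the inputs.
    hidden = [
        sigmoid(sum(w * x for w, x in zip(weights, features)) + bias)
        for weights, bias in zip(hidden_weights, hidden_biases)
    ]
    # The output node weights and aggregates the hidden node results.
    return sigmoid(sum(w * h for w, h in zip(output_weights, hidden)) + output_bias)


def includes_driving_event_sound(likelihood: float, threshold: float = 0.5) -> bool:
    """Compare the likelihood to a threshold likelihood."""
    return likelihood > threshold


# Hypothetical, hand-picked parameters for three hidden nodes.
HIDDEN_WEIGHTS = [[0.8, -0.2, 0.5, 0.1], [-0.4, 0.9, 0.3, -0.6], [0.2, 0.2, -0.7, 0.5]]
HIDDEN_BIASES = [0.0, 0.1, -0.1]
OUTPUT_WEIGHTS = [1.2, -0.8, 0.6]
OUTPUT_BIAS = -0.3

features = [0.7, 0.9, 0.4, 0.2]  # frequency, amplitude, pitch, tone (normalized)
likelihood = forward(features, HIDDEN_WEIGHTS, HIDDEN_BIASES, OUTPUT_WEIGHTS, OUTPUT_BIAS)
detected = includes_driving_event_sound(likelihood)
```

A network with several output nodes, one per type of driving event sound, would repeat the final weighted sum once per output node.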

However, this is merely one example of the inputs and resulting output of the neural network for detecting driving event sounds. In other examples, any number of input nodes may include any suitable audio characteristics for an audio stream. Additionally, any number of output nodes may determine likelihoods of an audio stream including a driving event sound or likelihoods of the audio stream including particular types of driving event sounds.

As additional training data is collected, the weights, nodes, and/or connections may be adjusted. In this manner, the machine learning model is constantly or periodically updated.

In any event, the driving event sound ML model generator 68 may provide the first machine learning model 310 to the client device 10. Then when the driving event sound detector 134 obtains an audio stream 314, such as during a navigation session, the driving event sound detector 134 may apply characteristics of the audio stream 314 as an input to the first machine learning model 310 to determine whether the audio stream 314 includes a driving event sound 318. Such a determination can be provided as an output of the first machine learning model 310. The driving event sound detector 134 may also determine the type of driving event sound 318 using the first machine learning model 310, which may provide such a determination as an output. For example, for a first audio stream obtained from audio playback data at another application executing on the client device 10, the first machine learning model 310 determines that the first audio stream includes a driving event sound. For a second audio stream obtained from audio playback data at a device 14, 92 communicatively connected to the client device 10, the first machine learning model 310 determines that the second audio stream includes a police siren. For a third audio stream obtained from ambient audio within the area of the vehicle 12, the first machine learning model 310 determines that the third audio stream does not include a driving event sound.

As described above, machine learning is merely one example technique for detecting a driving event sound. In other implementations, the driving event sound detector 134 may detect a driving event sound by comparing audio fingerprints of predetermined driving event sounds to ambient audio within the area of the vehicle 12 or audio streams from audio playback data from another application executing on the client device 10 or from a device 14, 92 communicatively connected to the client device 10. When there is a match, the driving event sound detector 134 may determine that the audio includes a driving event sound. More specifically, the driving event sound detector 134 may extract fingerprints from the ambient audio or audio streams, identify features of the ambient audio or audio stream fingerprints, and may compare features of the ambient audio or audio stream fingerprints to features of audio fingerprints from predetermined driving event sounds. For example, frequencies, pitches, tones, amplitudes, etc., may be stored as audio fingerprint features. Then each of these audio fingerprint features for the predetermined driving event sounds may be compared to the features of the ambient audio or audio stream fingerprints.

In some embodiments, the audio fingerprint features for the predetermined driving event sounds may be compared to the features for the ambient audio or audio stream fingerprints using a nearest neighbors algorithm. The nearest neighbors algorithm may identify audio fingerprint features for predetermined driving event sounds which are the closest to the features of the ambient audio or audio stream fingerprints. The driving event sound detector 134 may then determine that the ambient audio or audio stream includes a driving event sound when the ambient audio or audio stream fingerprint features match with or have more than a threshold amount of similarity with the audio fingerprint features for one of the predetermined driving event sounds. The driving event sound detector 134 may also determine that the ambient audio or audio stream includes the particular type of driving event sound in the predetermined driving event sound that matches with or has more than a threshold amount of similarity with the ambient audio or audio stream fingerprints.
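As a rough illustration of the nearest neighbors comparison described above, the following sketch finds the closest predetermined fingerprint and accepts it only when the similarity exceeds a threshold. The three-element feature vectors, the reference values, the use of cosine similarity, and the 0.98 threshold are all hypothetical choices made for the example.

```python
import math

# Hypothetical reference fingerprints: (type of driving event sound, feature vector).
REFERENCE_FINGERPRINTS = [
    ("police siren", [0.90, 0.80, 0.70]),
    ("vehicle horn", [0.30, 0.95, 0.20]),
    ("vehicle collision", [0.10, 0.99, 0.05]),
]


def cosine_similarity(a, b):
    """Similarity of two feature vectors; 1.0 means identical direction."""
    dot = sum(x * y for x, y in zip(a, b))
    norm = math.sqrt(sum(x * x for x in a)) * math.sqrt(sum(y * y for y in b))
    return dot / norm if norm else 0.0


def match_driving_event_sound(features, references=REFERENCE_FINGERPRINTS, threshold=0.98):
    """Return (sound type or None, best similarity) for the nearest reference."""
    best_type, best_score = None, -1.0
    for sound_type, reference in references:
        score = cosine_similarity(features, reference)
        if score > best_score:
            best_type, best_score = sound_type, score
    # Only report a match when the nearest neighbor is similar enough.
    if best_score > threshold:
        return best_type, best_score
    return None, best_score


siren_type, _ = match_driving_event_sound([0.90, 0.80, 0.70])
unknown_type, _ = match_driving_event_sound([0.0, 0.0, 1.0])
```

Returning the type of the nearest reference alongside the score mirrors the text: the same lookup both detects a driving event sound and identifies its particular type.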

In yet other implementations, the driving event sound detector 134 may detect a driving event sound based on metadata describing audio content from another application executing on the client device 10 or from a device 14, 92 communicatively connected to the client device 10. The metadata may indicate that the audio content includes a driving event sound, the type of driving event sound, and/or when the driving event sound will be played.

When a driving event sound is detected, the driving event sound detector 134 may determine whether the driving event sound is real or artificial. FIG. 4 schematically illustrates an example process for training a second machine learning model 410 for identifying whether a driving event sound is real or artificial and applying the characteristics of a detected driving event sound to the second machine learning model 410 to determine whether the detected driving event sound is real or artificial. As described above, the driving event sound ML model generator 68 in the server device 60 may generate the second machine learning model 410. The second machine learning model 410 may be generated using various machine learning techniques such as a regression analysis (e.g., a logistic regression, linear regression, or polynomial regression), k-nearest neighbors, decision trees, random forests, boosting (e.g., extreme gradient boosting), neural networks (e.g., a convolutional neural network), support vector machines, deep learning, reinforcement learning, Bayesian networks, etc. To generate the second machine learning model 410, the driving event sound ML model generator 68 receives training data including a first driving event sound 402 a having a first set of driving event sound characteristics 404 a, and a first indication of whether the first driving event sound 402 a is real or artificial 406 a.

The driving event sound characteristics may include audio characteristics of the driving event sound as well as environmental characteristics at the vehicle 12 at the time of the driving event sound. The audio characteristics of the driving event sound may include frequencies, pitches, tones, amplitudes, wavelengths, etc. In some implementations, the audio characteristics of the driving event sound may include changes in frequency or changes in wavelength over time which may be indicative of a Doppler effect. The Doppler effect may indicate that the driving event sound is real and came from an external source that was moving relative to the vehicle 12. The environmental characteristics may include sensor data from the vehicle 12, such as sensor data from cameras within the vehicle 12, tire pressure sensors, vehicle door sensors, seat belt sensors, accelerometers, gyroscopes, positioning sensors, etc. Additionally, the driving event sound characteristics may include an indication of the type of driving event sound, such as a police siren, a fire truck siren, an ambulance siren, a vehicle honking, the sound of a vehicle collision, a vehicle malfunction alarm, etc. Still further, the driving event sound characteristics may include media content characteristics, such as whether a passenger is playing an electronic game in the vehicle 12, a type of electronic game being played, a name of the electronic game, whether the radio is playing within the vehicle 12, a name of the song or content currently being played, etc. The media content characteristics may be determined from metadata provided by another application 132 executing on the client device 10 via an API or by another device 92 via a short-range communication link.

The training data also includes a second driving event sound 402 b having a second set of driving event sound characteristics 404 b, and a second indication of whether the second driving event sound 402 b is real or artificial 406 b. Furthermore, the training data includes a third driving event sound 402 c having a third set of driving event sound characteristics 404 c, and a third indication of whether the third driving event sound 402 c is real or artificial 406 c. Still further, the training data includes an nth driving event sound 402 n having an nth set of driving event sound characteristics 404 n, and an nth indication of whether the nth driving event sound 402 n is real or artificial 406 n.

While the example training data includes four driving event sounds 402 a-402 n, this is merely for ease of illustration. The training data may include any number of driving event sounds and corresponding driving event sound characteristics and indications of whether the driving event sound is real or artificial.

The driving event sound ML model generator 68 then analyzes the training data to generate a second machine learning model 410 for determining whether a driving event sound is real or artificial. In some implementations, the driving event sound ML model generator 68 generates a separate machine learning model for each type of driving event sound. While the second machine learning model 410 is illustrated as a linear regression model, the second machine learning model 410 may be another type of regression model such as a logistic regression model, or may be a decision tree, a neural network, a hyperplane, or any other suitable machine learning model.

In any event, the driving event sound ML model generator 68 may provide the second machine learning model 410 to the client device 10. Then when the driving event sound detector 134 detects a driving event sound 414, such as during a navigation session, the driving event sound detector 134 may apply characteristics of the driving event sound 414 as an input to the second machine learning model 410 to determine whether the driving event sound 414 is real or artificial 418. Such a determination can be provided as an output of the second machine learning model 410.

As described above, machine learning is merely one example technique for determining whether a driving event sound is real or artificial. In other implementations, the driving event sound detector 134 may determine a driving event sound is artificial if the source of the driving event sound is an application executing on the client device 10 or another device 14, 92. In yet other implementations, the driving event sound detector 134 may determine a driving event sound is artificial by determining a geographic source of the driving event sound, and comparing the geographic source to the current location of the vehicle 12. For example, emergency vehicle sirens in different countries may have different audio characteristics. The driving event sound detector 134 may compare the audio characteristics of the driving event sound to audio characteristics of emergency vehicle sirens in different countries to determine the region of origin of the driving event sound. If the region of origin of the driving event sound differs from the current location of the vehicle 12, the driving event sound detector 134 may determine that the driving event sound is artificial. In other implementations, the driving event sound detector 134 may determine a driving event sound is artificial based on a change in frequency of the driving event sound over time. If the frequency of the driving event sound does not change over time by more than a threshold amount indicating a Doppler shift, the driving event sound detector 134 may determine that the source of the driving event sound is not moving relative to the vehicle 12, and therefore the driving event sound is artificial. In other implementations, the driving event sound detector 134 may determine that the driving event sound is artificial by comparing the ambient audio fingerprint or an audio fingerprint from the audio stream included in the audio playback data to audio fingerprints of predetermined artificial driving event sounds. When there is a match, the driving event sound detector 134 may determine that the audio includes an artificial driving event sound. The driving event sound detector 134 may also compare the ambient audio fingerprint or an audio fingerprint from the audio stream included in the audio playback data to audio fingerprints of predetermined real driving event sounds. When there is a match, the driving event sound detector 134 may determine that the audio includes a real driving event sound.
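The frequency-change check described above can be sketched as follows. The sketch assumes the detector can already track the frequency of one fixed tone of the sound over time, uses the classical Doppler relation for a stationary listener, and picks a 2% relative shift as the threshold; the function names, the threshold, and the stationary-listener simplification are assumptions made for illustration.

```python
SPEED_OF_SOUND_M_S = 343.0  # approximate speed of sound in air at about 20 C


def observed_frequency(source_hz: float, approach_speed_m_s: float) -> float:
    """Classical Doppler shift heard by a stationary listener.

    A positive approach speed means the source moves toward the listener,
    a negative one means it moves away.
    """
    return source_hz * SPEED_OF_SOUND_M_S / (SPEED_OF_SOUND_M_S - approach_speed_m_s)


def is_artificial_by_doppler(tone_hz_over_time, min_relative_shift: float = 0.02) -> bool:
    """Treat the sound as artificial when the tracked tone barely changes.

    tone_hz_over_time: successive frequency estimates of the same tone.
    """
    if not tone_hz_over_time or min(tone_hz_over_time) <= 0:
        raise ValueError("expected at least one positive frequency estimate")
    lowest = min(tone_hz_over_time)
    highest = max(tone_hz_over_time)
    relative_shift = (highest - lowest) / lowest
    # No shift above the threshold suggests the source is not moving
    # relative to the vehicle.
    return relative_shift <= min_relative_shift


# A 1000 Hz tone from a source passing at 20 m/s: approaching, alongside, receding.
passing_siren = [
    observed_frequency(1000.0, 20.0),
    1000.0,
    observed_frequency(1000.0, -20.0),
]
# The same tone played from a speaker inside the vehicle, with small estimation noise.
in_cabin_playback = [1000.0, 1000.5, 999.8]
```

Here `is_artificial_by_doppler(passing_siren)` is false and `is_artificial_by_doppler(in_cabin_playback)` is true, matching the reasoning in the text.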

When a driving event sound is detected and determined to be real or artificial, the driving event sound detector 134 may provide a notification to the driver indicating whether the driving event sound is real or artificial. The notification may be presented on the display of the client device 10 or may be an audio notification presented via the speakers of the client device 10 or the vehicle head unit 14. FIGS. 5A-5C illustrate example navigation displays 500-560 which include notifications to the driver in response to detecting a driving event sound. As shown in the example navigation display 500 of FIG. 5A, a visual notification 510 may be presented as a banner within the navigation display 500. The visual notification 510 indicates that the driving event sound is artificial by stating, “That car horn sound is part of the media stream.” In some implementations, the visual notification 510 may further provide an instruction to ignore the driving event sound. In addition or as an alternative to the visual notification 510, the client device 10 may present an audio notification 512, via the speakers, indicating that the car horn sound is part of the media stream.

In other implementations, the driving event sound detector 134 may notify the driver that the driving event sound is artificial using an earcon. An earcon is a brief, distinctive sound that represents a specific event, such as the arrival of an electronic mail message. FIG. 5B illustrates an example navigation display 530 including an audio notification 532 in the form of an earcon. The earcon may be a particular sound such as a beep, a long beep, or a set of beeps that signals to the driver that the driving event sound is artificial. The earcon may be generated so that it is distinct from the driving event sounds, such that the driver will not mistake the earcon for a continuation of the driving event sound.

FIG. 5C illustrates another example navigation display 560 presented when the driving event sound is real. In this scenario, the driving event sound detector 134 may provide a visual notification 570 to the driver indicating that the driving event sound is real and instructing the driver to be alert for the emergency vehicle indicated by the driving event sound. The visual notification 570 may be presented as a banner within the navigation display 560. In some implementations, the visual notification 570 may further provide an instruction on how to respond to the real driving event sound, such as to pull over. In addition or as an alternative to the visual notification 570, the client device 10 may present an audio notification 572, via the speakers, indicating that the driver should be alert for an emergency vehicle. As would be understood, any type of visual notification 570 may be provided as a confirmation that the driving event sound is real.

Example Method for Identifying Driving Event Sounds

FIG. 6 illustrates a flow diagram of an example method 600 for identifying driving event sounds during a navigation session. The method can be implemented in a set of instructions stored on a computer-readable memory and executable at one or more processors of the client device 10 within the vehicle 12. For example, the method can be implemented by the driving event sound detector 134 and/or the mapping application 122.

At block 602, a set of navigation instructions is provided from a starting location to a destination location. For example, when a user such as the driver requests navigation directions from a starting location to a destination location, the mapping application 122 may provide the request to a navigation data server 34. The navigation data server 34 may then provide a set of navigation directions to the client device 10 which may be presented by the mapping application 122.

Then at block 604, the driving event sound detector 134 may identify audio within or around the vehicle 12 that includes a driving event sound. More specifically, the driving event sound detector 134 may obtain audio playback data by communicating with other applications 132 executing on the client device 10 (e.g., via an API), or other devices within the vicinity of the client device 10 (e.g., via a short-range communication link), such as the vehicle head unit 14 or other client devices 92. The driving event sound detector 134 may obtain an audio stream from the audio playback data. The driving event sound detector 134 may also capture an ambient audio fingerprint of the audio within the area via a microphone, for example. The driving event sound detector 134 may then compare the ambient audio fingerprint or an audio fingerprint from the audio stream included in the audio playback data to audio fingerprints of predetermined driving event sounds. When there is a match, the driving event sound detector 134 may determine that the audio includes a driving event sound. In other implementations, the driving event sound detector 134 may apply the ambient audio features or audio features from the audio stream to a trained machine learning model for identifying driving event sounds, such as the first machine learning model 310 as shown in FIG. 3.

In response to identifying audio within or around the vehicle 12 that includes a driving event sound, the driving event sound detector 134 may determine whether the driving event sound is real or artificial (block 606). More specifically, the driving event sound detector 134 may identify characteristics of the driving event sound, such as audio characteristics, environmental characteristics at the vehicle 12 at the time of the driving event sound, the type of driving event sound, etc. The driving event sound detector 134 may apply the characteristics of the driving event sound to a trained machine learning model for determining whether a driving event sound is real or artificial, such as the second machine learning model 410 as shown in FIG. 4.

In other implementations, the driving event sound detector 134 may determine a driving event sound is artificial if the source of the driving event sound is an application executing on the client device 10 or another device 14, 92. In yet other implementations, the driving event sound detector 134 may determine a driving event sound is artificial by determining a geographic source of the driving event sound, and comparing the geographic source to the current location of the vehicle 12. If the region of origin of the driving event sound differs from the current location of the vehicle 12, the driving event sound detector 134 may determine that the driving event sound is artificial. In other implementations, the driving event sound detector 134 may determine a driving event sound is artificial based on a change in frequency of the driving event sound over time. If the frequency of the driving event sound does not change over time by more than a threshold amount indicating a Doppler shift, the driving event sound detector 134 may determine that the source of the driving event sound is not moving relative to the vehicle 12, and therefore the driving event sound is artificial.

In yet other implementations, the driving event sound detector 134 may determine that the driving event sound is artificial by comparing the ambient audio fingerprint or an audio fingerprint from the audio stream included in the audio playback data to audio fingerprints of predetermined artificial driving event sounds. When there is a match, the driving event sound detector 134 may determine that the audio includes an artificial driving event sound. The driving event sound detector 134 may also compare the ambient audio fingerprint or an audio fingerprint from the audio stream included in the audio playback data to audio fingerprints of predetermined real driving event sounds. When there is a match, the driving event sound detector 134 may determine that the audio includes a real driving event sound.

If the driving event sound is real, the driving event sound detector 134 may alert the driver to respond to the driving event sound (block 610) or may otherwise confirm that the driving event sound is real. For example, the driving event sound detector 134 may provide a visual or audio notification to the driver, such as the notifications 570, 572 as shown in FIG. 5C, indicating that the driving event sound is real. The notification may also instruct the driver to respond appropriately to the driving event sound or may provide explicit instructions on how to respond to the driving event sound, such as pull over, slow down, call for help, take the vehicle in for service, etc.

On the other hand, if the driving event sound is artificial, the driving event sound detector 134 presents a notification to the driver or masks the driving event sound to substantially prevent the driver from hearing the driving event sound and being unnecessarily distracted (block 612). The driving event sound detector 134 may provide a visual or audio notification to the driver, such as the notifications 510, 512 as shown in FIG. 5A, indicating that the driving event sound is artificial. The notification may also instruct the driver to ignore the driving event sound.
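The branching across blocks 604 through 612 can be summarized in a short sketch. The detection, classification, and masking decisions are passed in as callables so the sketch stays independent of any particular technique; the `DrivingEventSound` type, the source labels, and the returned action names are hypothetical and chosen only for illustration.

```python
from dataclasses import dataclass
from typing import Callable, Optional


@dataclass
class DrivingEventSound:
    sound_type: str  # e.g. "police siren"
    source: str      # hypothetical labels: "ambient", "app", "device"


def handle_audio(
    audio: object,
    detect: Callable[[object], Optional[DrivingEventSound]],
    is_artificial: Callable[[DrivingEventSound], bool],
    can_mask: Callable[[DrivingEventSound], bool],
) -> str:
    """Return the action to take for one piece of audio."""
    sound = detect(audio)  # block 604: identify a driving event sound
    if sound is None:
        return "no_action"
    if not is_artificial(sound):  # block 606: real or artificial?
        return "alert_driver"  # block 610: alert or confirm that the sound is real
    # Block 612: mask the sound when possible, otherwise notify the driver.
    return "mask_sound" if can_mask(sound) else "notify_artificial"


# Example policy: sound from an app or connected device is treated as artificial,
# and only audio played by an app on the client device is masked.
game_siren = DrivingEventSound("police siren", "app")
street_siren = DrivingEventSound("police siren", "ambient")
from_playback = lambda sound: sound.source != "ambient"
from_app = lambda sound: sound.source == "app"

game_action = handle_audio(b"", lambda _: game_siren, from_playback, from_app)
street_action = handle_audio(b"", lambda _: street_siren, from_playback, from_app)
```

In this example `game_action` is `"mask_sound"` and `street_action` is `"alert_driver"`.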

The driving event sound detector 134 may mask the driving event sound by muting or decreasing the volume on the audio during the driving event sound. For example, when the driving event sound is provided by an application 132 executing on the client device 10, the driving event sound detector 134 may communicate with the application 132 via an API to instruct the application 132 to decrease or mute the volume on the audio during the driving event sound. When the driving event sound is provided by another device 14, 92 communicatively coupled to the client device 10, the driving event sound detector 134 may communicate with the other device 14, 92 via a short-range communication link to transmit a request to the other device 14, 92 to decrease or mute the volume on the audio during the driving event sound.

In some implementations, the driving event sound detector 134 determines to mute or decrease the volume proactively before the driving event sound is played. For example, the driving event sound detector 134 may be able to proactively determine to mute or decrease the volume for the artificial driving event sound, when the artificial driving event sound is identified through metadata describing audio content before it has been played. Beneficially, this prevents the driver from hearing any portion of the artificial driving event sound. In other implementations, the driving event sound detector 134 determines to mute or decrease the volume for the driving event sound as the driving event sound is played to prevent the driver from hearing at least a portion of the driving event sound.
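One way to picture proactive muting from metadata is to turn the metadata into a list of time windows during which playback volume is reduced. The metadata layout below (`start_s`, `end_s`, and `driving_event_sound` keys), the padding value, and the volume levels are hypothetical; the disclosure does not specify a metadata format.

```python
def mute_windows(cues, padding_s: float = 0.25):
    """Build merged (start, end) windows, in seconds, covering flagged cues.

    cues: iterable of dicts with hypothetical keys "start_s", "end_s", and
    "driving_event_sound". A small padding is added on each side so that
    no portion of the sound is heard.
    """
    windows = sorted(
        (max(0.0, cue["start_s"] - padding_s), cue["end_s"] + padding_s)
        for cue in cues
        if cue.get("driving_event_sound")
    )
    merged = []
    for start, end in windows:
        if merged and start <= merged[-1][1]:
            # Overlapping windows collapse into one continuous mute.
            merged[-1] = (merged[-1][0], max(merged[-1][1], end))
        else:
            merged.append((start, end))
    return merged


def volume_at(time_s: float, windows, normal: float = 1.0, reduced: float = 0.0) -> float:
    """Volume to apply at a playback position; 0.0 mutes, values in between duck."""
    return reduced if any(start <= time_s < end for start, end in windows) else normal


metadata_cues = [
    {"start_s": 10.0, "end_s": 12.0, "driving_event_sound": True},
    {"start_s": 12.1, "end_s": 13.0, "driving_event_sound": True},
    {"start_s": 30.0, "end_s": 31.0, "driving_event_sound": False},
]
windows = mute_windows(metadata_cues)
```

With these cues the two flagged sounds merge into a single window, so the volume is reduced once rather than toggled twice in quick succession.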

In other implementations, the driving event sound detector 134 may mask the driving event sound by filtering the driving event sound from the audio stream. More specifically, the driving event sound detector 134 may cause a filter to be provided to the audio stream, such as a bandpass filter or a machine learning model for filtering driving event sounds. The driving event sound detector 134 may provide the filter to the other application 132 executing on the client device 10 or the other device 14, 92 communicatively coupled to the client device 10 to filter the driving event sound from the audio stream.
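As a simplified illustration of filtering a driving event sound out of an audio stream, the sketch below applies a second-order notch (band-stop) filter centered on a single tone. This is a stand-in chosen for the example rather than the filter named in the text: a real siren spans a range of frequencies and would likely need a wider or adaptive filter. The coefficient formulas are the widely used biquad notch design; the center frequency, sample rate, and Q value are assumptions.

```python
import math


def notch_coefficients(center_hz: float, sample_rate_hz: float, q: float = 5.0):
    """Normalized biquad notch coefficients ((b0, b1, b2), (a1, a2))."""
    w0 = 2.0 * math.pi * center_hz / sample_rate_hz
    alpha = math.sin(w0) / (2.0 * q)
    a0 = 1.0 + alpha
    b = (1.0 / a0, -2.0 * math.cos(w0) / a0, 1.0 / a0)
    a = (-2.0 * math.cos(w0) / a0, (1.0 - alpha) / a0)
    return b, a


def apply_biquad(samples, b, a):
    """Run a direct-form I biquad over a list of samples."""
    x1 = x2 = y1 = y2 = 0.0
    out = []
    for x0 in samples:
        y0 = b[0] * x0 + b[1] * x1 + b[2] * x2 - a[0] * y1 - a[1] * y2
        out.append(y0)
        x2, x1 = x1, x0
        y2, y1 = y1, y0
    return out


def rms(samples) -> float:
    """Root-mean-square level of a list of samples."""
    return math.sqrt(sum(s * s for s in samples) / len(samples))


def sine(freq_hz: float, sample_rate_hz: float, n: int):
    """Generate n samples of a unit-amplitude sine tone."""
    return [math.sin(2.0 * math.pi * freq_hz * i / sample_rate_hz) for i in range(n)]


SAMPLE_RATE_HZ = 8000.0
b, a = notch_coefficients(1000.0, SAMPLE_RATE_HZ)
filtered_siren_tone = apply_biquad(sine(1000.0, SAMPLE_RATE_HZ, 2000), b, a)
filtered_other_audio = apply_biquad(sine(200.0, SAMPLE_RATE_HZ, 2000), b, a)
```

After the initial transient, the 1000 Hz tone is almost entirely removed while the 200 Hz content passes through nearly unchanged, which is the behavior the masking step relies on.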

In some implementations, the driving event sound detector 134 performs the filtering proactively before the driving event sound is played. For example, the driving event sound detector 134 may be able to proactively filter the artificial driving event sound, when the artificial driving event sound is identified through metadata describing audio content before it has been played. In other implementations, the driving event sound detector 134 performs the filtering as the driving event sound is played to filter at least a portion of the driving event sound.

In yet other implementations, the driving event sound detector 134 may mask the driving event sound by emitting a noise cancelling sound via the speakers of the client device 10 or causing the speakers within the vehicle 12 to emit the noise cancelling sound to destructively interfere with the driving event sound and muffle or remove the driving event sound. The noise cancelling sound may have the same or a similar amplitude/frequency as the driving event sound and an inverted phase from the phase of the driving event sound.

The driving event sound detector 134 may determine the amplitude and phase of the driving event sound based on the characteristics of the audio stream that includes the driving event sound. In other implementations, the driving event sound detector 134 may determine the amplitude and phase of the driving event sound based on a set of predetermined characteristics for a particular type of driving event sound (e.g., an ambulance siren).

Then the driving event sound detector 134 may generate the noise cancelling sound by generating an audio signal with the same or a similar amplitude as the driving event sound and an inverted phase from the phase of the driving event sound. The driving event sound detector 134 may then play the noise cancelling sound via the speakers of the client device 10 or transmit an indication of the noise cancelling sound to the vehicle head unit 14 to play the noise cancelling sound via the speakers 26 within the vehicle 12.
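The destructive interference described above can be illustrated with a single tone. The sketch models the driving event sound as one sine wave with known amplitude, frequency, and phase, and generates a cancelling signal shifted by half a cycle. The tone parameters and function names are assumptions; a real in-cabin system would also have to account for speaker placement and acoustic delay, which this sketch ignores.

```python
import math


def tone(freq_hz, amplitude, phase_rad, sample_rate_hz, n):
    """Generate n samples of a sine tone."""
    return [
        amplitude * math.sin(2.0 * math.pi * freq_hz * i / sample_rate_hz + phase_rad)
        for i in range(n)
    ]


def cancelling_sound(freq_hz, amplitude, phase_rad, sample_rate_hz, n):
    """Same frequency and amplitude, with the phase inverted (shifted by pi)."""
    return tone(freq_hz, amplitude, phase_rad + math.pi, sample_rate_hz, n)


def mix(a, b):
    """Sum two signals sample by sample, as they would combine in the air."""
    return [x + y for x, y in zip(a, b)]


def rms(samples) -> float:
    """Root-mean-square level of a list of samples."""
    return math.sqrt(sum(s * s for s in samples) / len(samples))


siren = tone(700.0, 0.8, 0.3, 8000.0, 800)
# Exact estimate of amplitude and phase: the sum is essentially silent.
exact = mix(siren, cancelling_sound(700.0, 0.8, 0.3, 8000.0, 800))
# Similar but not identical amplitude: the sound is muffled, not removed.
approximate = mix(siren, cancelling_sound(700.0, 0.6, 0.3, 8000.0, 800))
```

The second case corresponds to the "same or a similar amplitude" wording: an imperfect estimate still lowers the level of the driving event sound even though it does not remove it.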

Additional Considerations

The following additional considerations apply to the foregoing discussion. Throughout this specification, plural instances may implement components, operations, or structures described as a single instance. Although individual operations of one or more methods are illustrated and described as separate operations, one or more of the individual operations may be performed concurrently, and nothing requires that the operations be performed in the order illustrated. Structures and functionality presented as separate components in example configurations may be implemented as a combined structure or component. Similarly, structures and functionality presented as a single component may be implemented as separate components. These and other variations, modifications, additions, and improvements fall within the scope of the subject matter of the present disclosure.

Additionally, certain embodiments are described herein as including logic or a number of components, modules, or mechanisms. Modules may constitute either software modules (e.g., code stored on a machine-readable medium) or hardware modules. A hardware module is a tangible unit capable of performing certain operations and may be configured or arranged in a certain manner. In example embodiments, one or more computer systems (e.g., a standalone, client or server computer system) or one or more hardware modules of a computer system (e.g., a processor or a group of processors) may be configured by software (e.g., an application or application portion) as a hardware module that operates to perform certain operations as described herein.

In various embodiments, a hardware module may be implemented mechanically or electronically. For example, a hardware module may comprise dedicated circuitry or logic that is permanently configured (e.g., as a special-purpose processor, such as a field programmable gate array (FPGA) or an application-specific integrated circuit (ASIC)) to perform certain operations. A hardware module may also comprise programmable logic or circuitry (e.g., as encompassed within a general-purpose processor or other programmable processor) that is temporarily configured by software to perform certain operations. It will be appreciated that the decision to implement a hardware module mechanically, in dedicated and permanently configured circuitry, or in temporarily configured circuitry (e.g., configured by software) may be driven by cost and time considerations.

Accordingly, the term hardware should be understood to encompass a tangible entity, be that an entity that is physically constructed, permanently configured (e.g., hardwired), or temporarily configured (e.g., programmed) to operate in a certain manner or to perform certain operations described herein. As used herein, “hardware-implemented module” refers to a hardware module. Considering embodiments in which hardware modules are temporarily configured (e.g., programmed), each of the hardware modules need not be configured or instantiated at any one instance in time. For example, where the hardware modules comprise a general-purpose processor configured using software, the general-purpose processor may be configured as respective different hardware modules at different times. Software may accordingly configure a processor, for example, to constitute a particular hardware module at one instance of time and to constitute a different hardware module at a different instance of time.

Hardware modules can provide information to, and receive information from, other hardware. Accordingly, the described hardware modules may be regarded as being communicatively coupled. Where multiple of such hardware modules exist contemporaneously, communications may be achieved through signal transmission (e.g., over appropriate circuits and buses) that connect the hardware modules. In embodiments in which multiple hardware modules are configured or instantiated at different times, communications between such hardware modules may be achieved, for example, through the storage and retrieval of information in memory structures to which the multiple hardware modules have access. For example, one hardware module may perform an operation and store the output of that operation in a memory device to which it is communicatively coupled. A further hardware module may then, at a later time, access the memory device to retrieve and process the stored output. Hardware modules may also initiate communications with input or output devices, and can operate on a resource (e.g., a collection of information).
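By way of non-limiting illustration only, the memory-mediated communication described above may be sketched in software as follows. The names MemoryDevice, first_module, and further_module, the key "peak_amplitude", and the threshold value are hypothetical and do not appear elsewhere in this disclosure; the peak-amplitude computation is merely a placeholder for whatever operation the first module performs.

```python
from typing import Dict, List


class MemoryDevice:
    """Hypothetical stand-in for a memory structure both modules can access."""

    def __init__(self) -> None:
        self._cells: Dict[str, float] = {}

    def store(self, key: str, value: float) -> None:
        self._cells[key] = value

    def retrieve(self, key: str) -> float:
        return self._cells[key]


def first_module(memory: MemoryDevice, samples: List[float]) -> None:
    """Perform an operation (here, a peak amplitude) and store its output."""
    memory.store("peak_amplitude", max(abs(s) for s in samples))


def further_module(memory: MemoryDevice, threshold: float = 0.5) -> bool:
    """At a later time, retrieve the stored output and process it."""
    return memory.retrieve("peak_amplitude") >= threshold


# The two modules never call each other; they share only the memory device.
memory = MemoryDevice()
first_module(memory, [0.1, -0.8, 0.3])
loud = further_module(memory)
```

In this sketch the two modules need not exist at the same time, because the only coupling between them is the stored value.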

The method 600 may include one or more function blocks, modules, individual functions or routines in the form of tangible computer-executable instructions that are stored in a non-transitory computer-readable storage medium and executed using a processor of a computing device (e.g., a server device, a personal computer, a smartphone, a tablet computer, a smart watch, a mobile computing device, or other client device, as described herein). The method 600 may be included as part of any backend server (e.g., a map data server, a navigation server, or any other type of server computing device, as described herein), client device modules of the example environment, for example, or as part of a module that is external to such an environment. Though the figures may be described with reference to the other figures for ease of explanation, the method 600 can be utilized with other objects and user interfaces. Furthermore, although the explanation above describes steps of the method 600 being performed by specific devices (such as a server device 60 or client device 10), this is done for illustration purposes only. The blocks of the method 600 may be performed by one or more devices or other parts of the environment.

The various operations of example methods described herein may be performed, at least partially, by one or more processors that are temporarily configured (e.g., by software) or permanently configured to perform the relevant operations. Whether temporarily or permanently configured, such processors may constitute processor-implemented modules that operate to perform one or more operations or functions. The modules referred to herein may, in some example embodiments, comprise processor-implemented modules.

Similarly, the methods or routines described herein may be at least partially processor-implemented. For example, at least some of the operations of a method may be performed by one or more processors or processor-implemented hardware modules. The performance of certain of the operations may be distributed among the one or more processors, not only residing within a single machine, but deployed across a number of machines. In some example embodiments, the processor or processors may be located in a single location (e.g., within a home environment, an office environment or as a server farm), while in other embodiments the processors may be distributed across a number of locations.

The one or more processors may also operate to support performance of the relevant operations in a “cloud computing” environment or as an SaaS. For example, as indicated above, at least some of the operations may be performed by a group of computers (as examples of machines including processors), these operations being accessible via a network (e.g., the Internet) and via one or more appropriate interfaces (e.g., APIs).

Still further, the figures depict some embodiments of the example environment for purposes of illustration only. One skilled in the art will readily recognize from the foregoing discussion that alternative embodiments of the structures and methods illustrated herein may be employed without departing from the principles described herein.

Upon reading this disclosure, those of skill in the art will appreciate still additional alternative structural and functional designs for identifying driving event sounds through the disclosed principles herein. Thus, while particular embodiments and applications have been illustrated and described, it is to be understood that the disclosed embodiments are not limited to the precise construction and components disclosed herein. Various modifications, changes and variations, which will be apparent to those skilled in the art, may be made in the arrangement, operation and details of the method and apparatus disclosed herein without departing from the spirit and scope defined in the appended claims.

What is claimed is:
 1. A method for training a machine learning model to determine whether a driving event sound is real or artificial, the method comprising: obtaining, by one or more processors, a set of driving event sound characteristics for each of a plurality of driving event sounds; for each driving event sound in the plurality of driving event sounds, obtaining, by the one or more processors, an indication of whether the driving event sound is from a real or artificial source; and training, by the one or more processors, a machine learning model to determine whether a driving event sound is real or artificial using (i) the set of driving event sound characteristics corresponding to each driving event sound, and (ii) the indication of whether each driving event sound is from the real or artificial source, wherein each set of driving event sound characteristics is classified according to whether the set corresponds to one of the plurality of driving event sounds from the real source or from the artificial source.
 2. The method of claim 1, further comprising: obtaining, by the one or more processors, audio playback data from an application executing on a client device; obtaining, by the one or more processors, audio playback data from a device communicatively coupled to the client device; or obtaining, by the one or more processors, ambient audio.
 3. The method of claim 2, further comprising: applying, by the one or more processors, the audio playback data from the application or the device or the ambient audio to the machine learning model to determine whether a driving event sound in the audio is artificial.
 4. The method of claim 3, further comprising: determining, by the one or more processors, that the audio includes the driving event sound.
 5. The method of claim 4, wherein determining that the audio includes the driving event sound includes: comparing, by the one or more processors, the audio playback data from the application or the device or audio fingerprints included in the ambient audio to one or more audio fingerprints of predetermined driving event sounds.
 6. The method of claim 4, wherein the machine learning model is a first machine learning model and further comprising: training a second machine learning model using (i) a set of audio streams, and (ii) an indication of the driving event sound corresponding to at least some of the audio streams in the set of audio streams.
 7. The method of claim 6, wherein determining that the audio includes the driving event sound includes: applying, by the one or more processors, the audio playback data from the application or the device or the ambient audio to the second machine learning model to determine whether the audio includes the driving event sound.
 8. The method of claim 1, further comprising: providing, by the one or more processors, the trained machine learning model to a client device for the client device to determine whether a driving event sound is artificial.
 9. A server device for training a machine learning model to determine whether a driving event sound is real or artificial, the server device comprising: one or more processors; and a non-transitory computer-readable memory coupled to the one or more processors and storing instructions thereon that, when executed by the one or more processors, cause the server device to: obtain a set of driving event sound characteristics for each of a plurality of driving event sounds; for each driving event sound in the plurality of driving event sounds, obtain an indication of whether the driving event sound is from a real or artificial source; and train a machine learning model to determine whether a driving event sound is real or artificial using (i) the set of driving event sound characteristics corresponding to each driving event sound, and (ii) the indication of whether each driving event sound is from the real or artificial source, wherein each set of driving event sound characteristics is classified according to whether the set corresponds to one of the plurality of driving event sounds from the real source or from the artificial source.
 10. The server device of claim 9, wherein the instructions further cause the server device to: obtain audio playback data from an application executing on a client device; obtain audio playback data from a device communicatively coupled to the client device; or obtain ambient audio.
 11. The server device of claim 10, wherein the instructions further cause the server device to: apply the audio playback data from the application or the device or the ambient audio to the machine learning model to determine whether a driving event sound in the audio is artificial.
 12. The server device of claim 11, wherein the instructions further cause the server device to: determine that the audio includes the driving event sound.
 13. The server device of claim 12, wherein to determine that the audio includes the driving event sound, the instructions cause the server device to: compare the audio playback data from the application or the device or audio fingerprints included in the ambient audio to one or more audio fingerprints of predetermined driving event sounds.
 14. The server device of claim 12, wherein the machine learning model is a first machine learning model and the instructions further cause the server device to: train a second machine learning model using (i) a set of audio streams, and (ii) an indication of the driving event sound corresponding to at least some of the audio streams in the set of audio streams.
 15. The server device of claim 14, wherein to determine that the audio includes the driving event sound, the instructions cause the server device to: apply the audio playback data from the application or the device or the ambient audio to the second machine learning model to determine whether the audio includes the driving event sound.
 16. The server device of claim 9, wherein the instructions further cause the server device to: provide the trained machine learning model to a client device for the client device to determine whether a driving event sound is artificial.
 17. A non-transitory computer-readable memory coupled to one or more processors and storing instructions thereon that, when executed by the one or more processors, cause the one or more processors to: obtain a set of driving event sound characteristics for each of a plurality of driving event sounds; for each driving event sound in the plurality of driving event sounds, obtain an indication of whether the driving event sound is from a real or artificial source; and train a machine learning model to determine whether a driving event sound is real or artificial using (i) the set of driving event sound characteristics corresponding to each driving event sound, and (ii) the indication of whether each driving event sound is from the real or artificial source, wherein each set of driving event sound characteristics is classified according to whether the set corresponds to one of the plurality of driving event sounds from the real source or from the artificial source.
 18. The non-transitory computer-readable memory of claim 17, wherein the instructions further cause the one or more processors to: obtain audio playback data from an application executing on a client device; obtain audio playback data from a device communicatively coupled to the client device; or obtain ambient audio.
 19. The non-transitory computer-readable memory of claim 18, wherein the instructions further cause the one or more processors to: apply the audio playback data from the application or the device or the ambient audio to the machine learning model to determine whether a driving event sound in the audio is artificial.
 20. The non-transitory computer-readable memory of claim 19, wherein the instructions further cause the one or more processors to: determine that the audio includes the driving event sound.
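
By way of non-limiting illustration only, and without limiting the claims, the training recited above (a set of sound characteristics per driving event sound, paired with an indication of a real or artificial source) may be sketched as follows. The nearest-centroid classifier is an assumption chosen for brevity and is not stated to be the model used in any embodiment. The two numeric characteristics, the training values, and the names LabeledSound, train_centroid_model, and is_artificial are likewise hypothetical.

```python
from dataclasses import dataclass
from math import dist
from statistics import fmean
from typing import Dict, List, Tuple


@dataclass(frozen=True)
class LabeledSound:
    # Hypothetical characteristics, e.g. (loudness change, pitch shift), scaled 0..1.
    characteristics: Tuple[float, ...]
    # Indication of whether the sound came from an artificial source.
    artificial: bool


def train_centroid_model(
    samples: List[LabeledSound],
) -> Dict[bool, Tuple[float, ...]]:
    """Classify each characteristic set by source, then average each class."""
    groups: Dict[bool, List[Tuple[float, ...]]] = {True: [], False: []}
    for sample in samples:
        groups[sample.artificial].append(sample.characteristics)
    if not groups[True] or not groups[False]:
        raise ValueError("need at least one real and one artificial example")
    # One centroid per class: the column-wise mean of that class's rows.
    return {
        label: tuple(fmean(column) for column in zip(*rows))
        for label, rows in groups.items()
    }


def is_artificial(
    model: Dict[bool, Tuple[float, ...]], characteristics: Tuple[float, ...]
) -> bool:
    """Return True when the sound lies closer to the artificial centroid."""
    return dist(characteristics, model[True]) < dist(characteristics, model[False])


# Toy training set; the values are invented for illustration.
TRAINING_SET = [
    LabeledSound((0.1, 0.0), artificial=True),
    LabeledSound((0.2, 0.1), artificial=True),
    LabeledSound((0.8, 0.9), artificial=False),
    LabeledSound((0.9, 0.7), artificial=False),
]

model = train_centroid_model(TRAINING_SET)
verdict = is_artificial(model, (0.1, 0.1))
```

A trained model of this kind is small enough to be provided to a client device, which could then apply it to characteristics extracted from playback data or ambient audio.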